NASA/TM-2014-218303 



Drought Prediction for Socio-Cultural Stability Project 


Christa Peters-Lidard, John B. Eylander, Randall Koster, Balachandrudu Narapusetty, Sujay Kumar,
Matt Rodell, John Bolten, David Mocko, Gregory Walker, Kristi Arsenault, Scott Rheingrover


April 2014 



NASA STI Program ... in Profile 


Since its founding, NASA has been dedicated to the 
advancement of aeronautics and space science. The 
NASA scientific and technical information (STI) pro- 
gram plays a key part in helping NASA maintain this 
important role. 

The NASA STI program operates under the auspices 
of the Agency Chief Information Officer. It collects, 
organizes, provides for archiving, and disseminates 
NASA’s STI. The NASA STI program provides access
to the NASA Aeronautics and Space Database and its 
public interface, the NASA Technical Report Server, 
thus providing one of the largest collections of aero- 
nautical and space science STI in the world. Results 
are published in both non-NASA channels and by 
NASA in the NASA STI Report Series, which includes 
the following report types: 

• TECHNICAL PUBLICATION. Reports of 
completed research or a major significant phase of 
research that present the results of NASA Programs 
and include extensive data or theoretical analysis. 
Includes compilations of significant scientific and 
technical data and information deemed to be of 
continuing reference value. NASA counterpart of 
peer-reviewed formal professional papers but has 
less stringent limitations on manuscript length and 
extent of graphic presentations. 

• TECHNICAL MEMORANDUM. Scientific 
and technical findings that are preliminary or of 
specialized interest, e.g., quick release reports, 
working papers, and bibliographies that contain 
minimal annotation. Does not contain extensive 
analysis. 

• CONTRACTOR REPORT. Scientific and technical 
findings by NASA-sponsored contractors and 
grantees. 


• CONFERENCE PUBLICATION. Collected 
papers from scientific and technical conferences, 
symposia, seminars, or other meetings sponsored or 
co-sponsored by NASA. 

• SPECIAL PUBLICATION. Scientific, technical, 
or historical information from NASA programs, 
projects, and missions, often concerned with 
subjects having substantial public interest. 

• TECHNICAL TRANSLATION. English-language 
translations of foreign scientific and technical 
material pertinent to NASA's mission. 

Specialized services also include organizing and 
publishing research results, distributing specialized 
research announcements and feeds, providing help 
desk and personal search support, and enabling data 
exchange services. For more information about the 
NASA STI program, see the following: 

• Access the NASA STI program home page at 
http://www.sti.nasa.gov 

• E-mail your question via the Internet to 
help@sti.nasa.gov 

• Fax your question to the NASA STI Help Desk at 
443-757-5803 

• Phone the NASA STI Help Desk at 443-757-5802 

• Write to: 

NASA STI Help Desk 

NASA Center for AeroSpace Information 

7115 Standard Drive 

Hanover, MD 21076-1320 


Available from: 


NASA Center for AeroSpace Information 
7115 Standard Drive 
Hanover, MD 21076-1320 


National Technical Information Service 
5285 Port Royal Road 
Springfield, VA 22161 



NASA/TM-2014-218303 



Drought Prediction for Socio-Cultural Stability Project 


Christa Peters-Lidard

Goddard Space Flight Center, Greenbelt, MD
John B. Eylander

U. S. Army ERDC/CRREL, Hanover, NH
Randall Koster

Goddard Space Flight Center, Greenbelt, MD 
Balachandrudu Narapusetty 

Science Applications International Corporation, McLean, VA 
Sujay Kumar 

Science Applications International Corporation, McLean, VA 
Matt Rodell

Goddard Space Flight Center, Greenbelt, MD 
John Bolten 

Goddard Space Flight Center, Greenbelt, MD 
David Mocko 

Science Applications International Corporation, McLean, VA 
Gregory Walker 

Science Applications International Corporation, McLean, VA 
Kristi Arsenault 

Science Applications International Corporation, McLean, VA 
Scott Rheingrover 

Science Applications International Corporation, McLean, VA 


National Aeronautics and 
Space Administration 

Goddard Space Flight Center 
Greenbelt, MD 


April 2014 




Notice for Copyrighted Information 

This manuscript is a joint work of employees of the National Aeronautics and Space Administration, the United 
States Army, and employees of Science Applications International Corporation under Contract #NNG12HP08C 
with the National Aeronautics and Space Administration. The United States Government has a non-exclusive, 
irrevocable, worldwide license to prepare derivative works, publish, or reproduce this manuscript, and allow others 
to do so, for United States Government purposes. 


Trade names and trademarks are used in this report for identification only. Their usage does not constitute an 
official endorsement, either expressed or implied, by the National Aeronautics and Space Administration. 


Level of Review: This material has been technically reviewed by technical management 



Drought Prediction for Socio-Cultural Stability Project 


Abstract 


The primary objective of this project is to answer the question: “Can existing, linked infrastructures 
be used to predict the onset of drought months in advance?” Based on our work, the answer to this 
question is “yes” with the qualifiers that skill depends on both lead-time and location, and especially
on the teleconnections (e.g., ENSO, Indian Ocean Dipole) active in a given
region/season.

As part of this work, we successfully developed a prototype drought early warning system based on 
existing/mature NASA Earth science components including the Goddard Earth Observing System 
Data Assimilation System Version 5 (GEOS-5) forecasting model, the Land Information System 
(LIS) land data assimilation software framework, the Catchment Land Surface Model (CLSM), 
remotely sensed terrestrial water storage from the Gravity Recovery and Climate Experiment 
(GRACE) and remotely sensed soil moisture products from the Aqua/Advanced Microwave 
Scanning Radiometer - EOS (AMSR-E). We focused on a single drought year — 2011 — during 
which major agricultural droughts occurred with devastating impacts in the Texas-Mexico region of 
North America (TEXMEX) and the Horn of Africa (HOA). 

Our results demonstrate that GEOS-5 precipitation forecasts show skill globally at 1-month lead, and
can show up to 3 months skill regionally in the TEXMEX and HOA areas. Our results also 
demonstrate that the CLSM soil moisture percentiles are a good indicator of drought, as compared to 
the North American Drought Monitor for TEXMEX and a combination of Famine Early Warning 
Systems Network (FEWS NET) data and Moderate Resolution Imaging Spectroradiometer 
(MODIS)’s Normalized Difference Vegetation Index (NDVI) anomalies over HOA. 

The data assimilation experiments produced mixed results. GRACE terrestrial water storage (TWS) 
assimilation was found to significantly improve soil moisture and evapotranspiration, as well as 
drought monitoring via soil moisture percentiles, while AMSR-E soil moisture assimilation produced 
marginal benefits. 

We carried out 1-3 month lead-time forecast experiments using GEOS-5 forecasts as input to 
LIS/CLSM. Based on these forecast experiments, we find that the expected skill in GEOS-5 
forecasts from 1-3 months is present in the soil moisture percentiles used to indicate drought. In the 
case of the HOA drought, the failure of the long rains in April appears in the February 1, March 1 
and April 1 initialized forecasts, suggesting that for this case, drought forecasting would have 
provided some advance warning about the drought conditions observed in 2011.

Three key recommendations for follow-up work include: (1) carry out a comprehensive analysis of 
droughts observed over the entire period of record for GEOS-5 forecasts; (2) continue to analyze the 
GEOS-5 forecasts in HOA stratifying by anomalies in long and short rains; and (3) continue to 
include GRACE TWS, Soil Moisture/Ocean Salinity (SMOS) and the upcoming NASA Soil 
Moisture Active/Passive (SMAP) soil moisture products in a routine activity building on this 
prototype to further quantify the benefits for drought assessment and prediction.

I. BACKGROUND


A. Objective 

The primary objective was to address the question: “Can existing, linked infrastructures be used 
to predict the onset of drought months in advance?” by developing a prototype drought early 
warning system based on existing/mature NASA Earth science. 

B. Approach 

Building on previous work demonstrating that drought early warning skill is a mixture of 
forecast skill and land state memory and initial conditions, the rationale for the work was to 
construct a system capable of harnessing all available skill from GEOS-5 seasonal forecasts and 
land state initialization via assimilation of AMSR-E and GRACE through LIS. Our approach
consisted of five tasks that not only connect the components of a prototype global drought early
warning system, but also evaluate the products of the system for the 2011 droughts in Texas and the
Horn of Africa. The tasks were: 1) evaluating GEOS-5 forecast skill for 2011; 2) conducting a
long-term (1948-2011) offline spinup with the LIS/Catchment model using observed forcing; 3)
conducting additional spinups from 2008 forward with AMSR-E and GRACE assimilation; 4)
conducting GEOS-5-driven forecast experiments initialized from the spinups with and without
AMSR-E and GRACE data assimilation (DA) land states; and 5) evaluating the skill of the forecasted droughts.

C. Description of GEOS-5 Model, MERRA and MERRA-Land 

The Global Modeling and Assimilation Office (GMAO) at NASA/GSFC hosts a seasonal 
forecast system consisting of an atmospheric general circulation model (AGCM, Rienecker et al. 
2008) coupled to the Catchment land surface model (Koster et al. 2000) and a full ocean GCM 
imported from the NOAA Geophysical Fluid Dynamics Laboratory (Griffies et al. 2005). The 
Goddard Earth Observing System Model, Version 5 (GEOS-5) is a system of models integrated 
using the Earth System Modeling Framework (ESMF). The GEOS-5 Data Assimilation System 
integrates the GEOS-5 AGCM with the Gridpoint Statistical Interpolation (GSI) atmospheric 
analysis developed jointly with NOAA/NCEP/EMC. The GEOS-5 systems are being developed 
in the GMAO to support NASA's earth science research in data analysis, observing system 
modeling and design, climate and weather prediction, and basic research. As part of GMAO 
quasi-operational activities, the atmosphere, land, and ocean states of the coupled models are 
initialized each month to realistic values, and (by varying the start date) an ensemble of 9-month 
simulations generates the forecasts of meteorological forcing, particularly precipitation, that are 
used in this project. The hourly forecasts are available from 1980-present, at a spatial resolution 
of 1.25 degrees latitude x 1 degree longitude. The experimental forecasts and historical 
performance can be viewed at the Experimental Climate Forecast web page 
(http://gmao.gsfc.nasa.gov/cgi-bin/products/climateforecasts/GEOS5/index.cgi). See
http://gmao.gsfc.nasa.gov/research/climate/ for more information on the forecast system.

The Modern-Era Retrospective Analysis for Research and Applications (MERRA) is a NASA
reanalysis for the satellite era that utilizes a fixed version of the Goddard Earth Observing 
System Data Assimilation System Version 5 (GEOS-5). The MERRA time period covers the 
modern era of remotely sensed data, from 1979 through the present, and the special focus of the
atmospheric assimilation is the hydrological cycle. Previous long-term reanalyses of the Earth's 
climate had high levels of uncertainty in precipitation and in precipitation’s interannual 
variability. The GEOS-5 data assimilation system used for MERRA implements Incremental 
Analysis Updates (IAU) to slowly adjust the model states toward the observed state. The water 
cycle benefits as unrealistic spin down is minimized. In addition, the model physical 
parameterizations have been tested and evaluated in a data assimilation context, which also 
reduces the shock of adjusting the model system. Land surface processes are modeled with the 
state-of-the-art GEOS-5 Catchment hydrology land surface model (described below). MERRA 
thus makes significant advances in the representation of the water cycle in reanalyses. 

MERRA output data resemble other global reanalyses, with several key advances, including 
output at frequencies higher than the 6-hourly analyses. Two-dimensional diagnostics (surface 
fluxes, single level meteorology, vertical integrals and land states) are produced at 1-hour 
intervals. These data products and the 6-hourly three-dimensional atmospheric analyses are also 
available at the full spatial resolution (1/2 degrees latitude x 2/3 degrees longitude). Extensive 
three-dimensional 3-hourly atmospheric diagnostics on 42 pressure levels are also available, but 
at coarser (1.25 degree) resolution. 

An improved set of land surface hydrological fields is provided in the supplemental MERRA- 
Land data product. In addition to the above-mentioned improvements, this product benefits from 
NOAA Climate Prediction Center observation-based corrections to the precipitation forcing and 
from revised parameter values in the rainfall interception model, changes that effectively correct 
for known limitations in the MERRA surface meteorological forcings. The MERRA-Land 
products are documented by Reichle et al. (2011) and Reichle (2012), who found that the 
MERRA-Land data appear more accurate than the original MERRA estimates and are thus 
recommended for those interested in using MERRA output for land surface hydrological studies. 
Like MERRA, MERRA-Land analyses are available hourly on a grid with the same spatial 
resolution (1/2 degrees latitude x 2/3 degrees longitude). 

D. Description of Catchment Land Surface Model 

The land surface model being used in this project is the Catchment Land Surface Model (CLSM; 
Koster et al., 2000; Ducharne et al., 2000), which calculates the water and energy balance of the
land surface using either observed or forecasted inputs such as precipitation, radiation, wind 
speed, temperature, humidity and pressure. For this work, we utilize the Fortuna 2.5 version of 
Catchment, which is the version used operationally in GEOS-5.7.2 since August 2011, and also in
MERRA-Land.

Unlike most LSMs, CLSM does not employ a one-dimensional "layered" framework, using
instead a framework that emphasizes subgrid spatial variability in moisture. In this LSM, 
subgrid heterogeneity in surface moisture state is treated statistically, since computational 
constraints (now and in the foreseeable future) prevent its explicit resolution. Nevertheless, the 
applied distributions are related sensibly to the topography, which exerts a major control over 
much of the subgrid variability. The approach is illustrated in Figure 1, which shows three
different levels of the (shallow) water table and the associated partitioning of the surface into 
three regions: (1) a saturated region, from which evaporation occurs with no water stress and 
over which rainfall is immediately converted to surface runoff, (2) a subsaturated region, from 
which transpiration occurs with limited water stress and over which rainwater infiltrates the soil, 
and (3) a “wilting” region, in which the water stress shuts down the transpiration completely. 
The relative areas of these regions, which vary in time, are unique functions of the local 
topography and the values of the Catchment LSM's three water prognostic variables. By 
continually partitioning the catchment into hydrologically distinct regimes and then applying 
different runoff and evaporation physics in the different regimes, the Catchment LSM should, at 
least in principle, produce a more realistic simulation of area-averaged surface energy and water 
processes.

Figure 1. Separation of the catchment area into hydrological regimes.

The soil water prognostic variables used by the Catchment LSM are “non-traditional” in that
they are not strictly associated with soil layers. The main variable, the “catchment deficit”, 
describes the equilibrium water table distribution and the associated distribution of the 
equilibrium soil moisture profiles in the overlying vadose zone. The second variable describes 
the degree to which the root zone is out of equilibrium with the catchment deficit, and the third 
describes the degree to which the near-surface moisture is out of equilibrium with the other two 
variables. The water transfer between the three variables and the baseflow flux out of the system 
are controlled in part by the local topography. 

The model's other prognostic variables include an interception reservoir water content, a surface 
temperature, and the heat contents of six subsurface soil layers, from which time-varying vertical 
profiles of soil temperature over several meters can be derived. The model allows explicit 
vegetation control over the computed surface energy and water balances, with environmental 
stresses (high temperatures, dry soil, etc.) acting to increase canopy resistance and thus decrease 
transpiration. Six fundamentally different types of vegetation are considered in the current 
version of the Catchment LSM: broadleaf evergreen trees, broadleaf deciduous trees, needleleaf 
trees, grassland, shrubs, and tundra vegetation. Bare soil evaporation, transpiration, and 
interception loss occur in parallel. The energy balance formulations in the model (again, applied 
separately in each hydrological regime) were derived in large part from the Mosaic land surface 
model (Koster and Suarez, 1996), which in turn borrowed heavily from the SiB model of Sellers 
et al. (1986) for the transpiration calculation. Snow is modeled using three prognostic variables 
(heat content, snow water equivalent, and snow depth) in each of three layers (Stieglitz et al., 
2001). The melting and refreezing of snow, snow compaction, liquid water retention, and the
impact of snow density on thermal conductivity and albedo are explicitly treated. 


E. Description of Land Information System 

NASA/GSFC has led the development of a comprehensive land surface modeling and data 
assimilation framework known as the NASA Land Information System (LIS; Kumar et al. 2006, 
Peters-Lidard et al. 2007). LIS provides the modeling and computational capabilities to merge 
observations and model forecasts to generate spatially and temporally coherent estimates of land 
surface conditions. These analyses are of critical importance to applications such as agricultural 
production, water resources management, and prediction of flood, drought, weather and climate. 

LIS includes a comprehensive suite of subsystems to support uncoupled and coupled land data 
assimilation, as shown in Figure 2. The LIS-LSM subsystem includes several community land 
surface models (LSMs), such as the CLSM described above, and supports their application at 
varying spatial and temporal scales, over regional, continental and global domains. 

The LIS Data Assimilation (LIS-DA; Kumar et al., 2008) subsystem supports multiple data 
assimilation algorithms that are focused on generating improved estimates of hydrologic model
states. The LIS-DA subsystem includes tools such as the Ensemble Kalman Filter (EnKF), which
is widely accepted as an effective technique for sequential assimilation of hydrologic variables. 
The EnKF provides a flexible approach for incorporating errors in the model and observations 
and its ensemble-based treatment of errors makes it suitable for handling the modestly non-linear 
dynamics and the temporal discontinuities that are typical of land surface processes. The LIS-DA 
subsystem is uniquely suited for assimilating disparate sources of observations into different land 
surface models. Retrievals of soil moisture from AMSR-E (Kumar et al., 2009; Peters-Lidard et 
al., 2011), skin temperature from ISCCP (Reichle et al., 2009), and snow water equivalent and 
snow cover from AMSR-E and MODIS (Yatheendradas et al., 2012; Liu et al., 2013) have been
assimilated into land surface models using LIS-DA. In addition, recently funded work will 
extend LIS-DA to allow direct radiance assimilation to overcome the limitations of the retrieval 
products.

Figure 2. Schematic of LIS showing the three subsystems described in the text: LIS-LSM, LIS-DA, and LIS-OPT/UE.
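
As a rough illustration of the EnKF analysis step described above, the following Python sketch updates a scalar ensemble with a single direct observation, using perturbed observations. It is not the LIS-DA implementation: the function name `enkf_update`, the scalar state, and the identity observation operator are assumptions made to keep the example short.

```python
import random

def enkf_update(ensemble, obs, obs_err_var, rng=None):
    """Scalar perturbed-observation EnKF analysis step.

    Assumes the state variable is observed directly (identity observation
    operator). Illustrative only; not taken from LIS-DA.
    """
    rng = rng or random.Random(0)
    n = len(ensemble)
    mean = sum(ensemble) / n
    # Sample variance of the forecast ensemble (model error estimate).
    var = sum((x - mean) ** 2 for x in ensemble) / (n - 1)
    # Kalman gain weighs model spread against observation error.
    gain = var / (var + obs_err_var)
    sd = obs_err_var ** 0.5
    # Each member is nudged toward its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, sd) - x) for x in ensemble]
```

When the observation error variance is small relative to the ensemble spread, the gain approaches 1 and the members collapse toward the observation; when it is large, the ensemble is left almost unchanged.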

The newest version of LIS (version 7 or LIS7), which is the version used for this project, 
includes a suite of subsystems for optimization (LIS-OPT) and uncertainty estimation (LIS-UE). 
These subsystems were demonstrated for satellite OSSEs related to soil moisture estimation
(Kumar et al. 2012a). Through the uncertainty estimation tools, the remotely sensed observations
were used to quantify the uncertainty in model parameters and model predictions (Harrison et al. 
2012). Overall, the integrated modeling, multi-scale resolution, ensemble run capabilities, 
inclusion of algorithms for exploiting space-based observations, and verification capabilities 
uniquely position LIS to serve as a platform for data assimilation experiments, including those 
used in this project. 

LIS7 utilizes a standard preprocessing toolkit known as the Land Data Toolkit (LDT) and relies
heavily on an open-source post-processing analysis package, the Land surface Verification
Toolkit (LVT; Kumar et al. 2012b).

II. TECHNICAL ACCOMPLISHMENTS 


A. GEOS-5 Forecast Skill 

The underlying premise of this work is that early warning skill for agricultural drought (deficits 
in soil moisture) is a mixture of meteorological forecast skill and land state memory combined 
with accurate initial conditions. To evaluate meteorological forecast skill for drought early 
warning applications, the most important forecast variable is precipitation. Hence, this section 
summarizes our findings with respect to GEOS-5 precipitation forecast skill for the HOA and 
TEXMEX domains. 




Figure 3. Histogram showing the decile of forecasted JJA precipitation (i.e., 0-10 = driest decile) for those locations and 
times when the observed JJA precipitation is in its lowest decile. To avoid noise associated with uncertain precipitation 
observations, a weighting defined by precipitation gauge density is applied to the data prior to binning. 


First, though, we provide in Figure 3 a global look at how well the GEOS-5 system predicts 
precipitation deficits. The histogram, which shows data for 3-month (June-August, or JJA)
precipitation forecasts at zero lead covering the period 1981-2012, was constructed as follows.
Thirty-two years of NOAA PREC/L precipitation observations were analyzed at a resolution of
5°x5°, and at each location, years for which the JJA precipitation lay in the lowest decile of 
observed values at that location were determined. We then examined the ensemble mean 
forecasts of JJA precipitation. We determined, for each instance of low observed JJA 
precipitation, the decile in which the corresponding forecasted precipitation fell. The histogram 
shows the distribution of these deciles. A perfect model performance would put all the counts in 
the first (0-10) decile; as can be seen, the forecasting system is far from perfect. (Seasonal 
precipitation prediction is a notoriously difficult and universal problem. Skill with our system 
for temperature prediction, not shown, is significantly higher.) There is nevertheless some 
tendency for the model to predict drier-than-average conditions when it is supposed to; this is the 
information we want to exploit in this project. 
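
The construction of the histogram can be sketched in Python for a single location as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, deciles are computed by simple ranking within each series, and the gauge-density weighting mentioned in the Figure 3 caption is omitted.

```python
from collections import Counter

def decile_index(value, sample):
    """Return 0-9: the decile of `value` within `sample` (0 = driest tenth)."""
    n = len(sample)
    rank = sum(1 for s in sample if s < value)  # values strictly below
    return min(9, (10 * rank) // n)

def forecast_deciles_for_dry_years(obs, fcst):
    """Count forecast deciles for the years whose observation is in decile 0.

    obs, fcst: equal-length lists of seasonal (e.g., JJA) totals for one
    location, one entry per year; fcst is the ensemble-mean forecast.
    """
    counts = Counter()
    for o, f in zip(obs, fcst):
        if decile_index(o, obs) == 0:
            counts[decile_index(f, fcst)] += 1
    return counts
```

Summing the returned counts over all locations would give an unweighted version of the histogram; a perfect forecast system would place every count in bin 0.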


Figure 4. Monthly ensemble mean GEOS-5 precipitation anomaly forecasts over the Horn of Africa for different
initialization dates and lead times. Observed anomalies (GPCP) appear in the rightmost column.

The skill of the GEOS-5 system in predicting the 2011 Texas drought is discussed in an online
GMAO report (http://gmao.gsfc.nasa.gov/research/climate/US_drought/); in short, predicted dry
conditions tend to be displaced eastward from Texas, though the Texas heat wave itself is
reasonably well predicted. A corresponding analysis of the system’s performance in predicting
the 2011 Horn of Africa drought is provided in Figure 4. The model shows some slight skill in
predicting this drought - at least some of its large-scale features if not the specific details of its
structure - even at 1-month lead (for the February 1 initialization). The forecasted drought,
however, is generally much less severe than the actual drought. This, of course, may represent
deficiencies in the forecast system. It must certainly also reflect, however, the unavoidable fact
that the forecast results represent an ensemble average, whereas the observations represent a 
single “realization” of weather - plots for some of the individual ensemble members will look 
much more realistic. Differences may also reflect deficiencies in the observational record over 
the area.

Figure 5. Correlation coefficients between the observed and GEOS-5 forecasted Dipole Mode Index (DMI) at different
lead times and for different starting months. The DMI is defined as the difference in Sea Surface Temperature (SST)
anomalies between (i) 50°-70°E, 10°S-10°N and (ii) 90°-110°E, 10°S-Eq. Correlations significant at the 5% level are shown by the
solid black lines.


Recent work (Behera et al., 2005; Saji et al., 1999) suggests that the Indian Ocean Dipole
exerts a strong control on the precipitation in the HOA region. Behera et al. found significant
correlation with the short rains (Oct-Nov) and Saji et al. found correlation with the long rains
(Apr-May). To investigate whether the GEOS-5 model captures this sea surface temperature
anomaly, we first examined the ability of the model to forecast the Dipole Mode Index (DMI) at
different lead times and starting months (Figure 5). As this figure shows, the GEOS-5 model
successfully forecasts this feature at lead times of 2-3 months and often much more than 3
months for start months of May through September. 
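
For reference, the DMI defined in the Figure 5 caption can be computed from a gridded SST anomaly field roughly as in the Python sketch below. The function names are hypothetical, the box averages are unweighted (no cosine-latitude area weighting, a small simplification this close to the equator), and the sign convention of western box minus eastern box is an assumption.

```python
def box_mean(sst_anom, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Unweighted mean of sst_anom[i][j] over grid points inside the box."""
    vals = [sst_anom[i][j]
            for i, lat in enumerate(lats) if lat_min <= lat <= lat_max
            for j, lon in enumerate(lons) if lon_min <= lon <= lon_max]
    return sum(vals) / len(vals)

def dipole_mode_index(sst_anom, lats, lons):
    """DMI as western-box minus eastern-box mean SST anomaly.

    Boxes follow the Figure 5 caption:
      west: 50-70E, 10S-10N;  east: 90-110E, 10S-Equator.
    lats in degrees north, lons in degrees east.
    """
    west = box_mean(sst_anom, lats, lons, -10, 10, 50, 70)
    east = box_mean(sst_anom, lats, lons, -10, 0, 90, 110)
    return west - east
```

Applying the same function to forecast and observed anomaly fields for each start month and lead time yields the two DMI series whose correlation is plotted in Figure 5.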

To further quantify the skill of GEOS-5 in representing this teleconnection, we examine the 
correlation coefficients between GEOS-5 forecasted and observed 1- and 2-month aggregated 
precipitation anomalies over HOA for 1981-2012. As shown in Figure 6, the skill for the long
rain months is generally small, but the skill for the short rain months extends to lead times of 2-3 months.

Figure 6. Correlation coefficients between GEOS-5 forecasted and observed 1- and 2-month aggregated precipitation
anomalies over HOA for 1981-2012. Correlations significant at the 5% level are shown with a solid black line.

Finally, the probabilistic skill of the forecasts is evaluated using reliability diagrams (Wilks
1995). Reliability diagrams describe a model’s ability to forecast extreme events based on the
fraction of ensemble members that capture an observed extreme event. The
information provided by reliability diagrams is two-fold. First, the diagram gives the probability with
which the ensemble forecasts a particular event; second, that forecast probability
is compared against the observed frequency of the event to assess the
forecast reliability. Together, these two steps provide the basis for probabilistic
verification with reliability diagrams. Reliability diagrams can be constructed from gridded
data (Barnston et al. 2003) as well as from area-specific indices such as Nino 3.4 (Saha et al. 2006).

For the Horn of Africa (HOA) case shown in Figure 7, reliability diagrams are constructed from
1-month lead forecasts obtained from each grid point. The reliability of the forecasts is studied
optimally based on five forecast probability bins falling in the 0-1/7, 1/7-2/7, 2/7-3/7, 3/7-4/7,
4/7-5/7, 5/7-6/7, 6/7-7/7, and 1 probability categories. Figures 7a and 7b show the model
forecasts’ reliability in reproducing the lowest decile and the lower half of the precipitation,
respectively. In each of the subplots of Figure 7, the right-side column of histograms shows the
number of samples that fall into the probability bins for the summer (J-J-A), fall (S-O-N), winter
(D-J-F) and spring (M-A-M) seasons.

Figure 7. Reliability diagrams for (a) the lowest decile and (b) the lower half of the GEOS-5 1-month lead precipitation forecasts
over HOA, 1981-2012.

In our reliability diagrams, the event evaluated for forecast probability versus observed
frequency is that the GEOS-5 precipitation forecast falls in the lowest decile, with the
observation doing the same. The grey diagonal line in each plot corresponds to forecast probabilities
that exactly match the observed event frequencies, and should be considered the reference for perfect
reliability. Curves above the perfect-reliability line indicate that the model gives false alarms, whereas
curves located below denote that the model underpredicts the events. Generally, a reliability curve
that is flat relative to the grey line indicates over-confident and persistent
forecasts (Barnston et al. 2003). The GEOS-5 ensemble forecasts generally over-predicted
the lowest decile of precipitation compared to observations (Figure 7a); however,
the lower half of precipitation is more closely reproduced with respect to observations
(Figure 7b).
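
The two steps behind a reliability diagram can be sketched in Python as below. This is an illustrative outline only: the function name and threshold arguments are hypothetical, and cases are grouped by the exact fraction of ensemble members forecasting the event rather than by the specific probability bins used for Figure 7.

```python
def reliability_curve(ensemble_fcsts, obs, threshold_fcst, threshold_obs):
    """Tabulate observed event frequency for each forecast probability.

    ensemble_fcsts: list of cases, each a list of ensemble member values
    obs:            list of observed values, one per case
    The event is "value below threshold" (e.g., lowest-decile precipitation),
    with separate thresholds for the model and observed climatologies.
    Returns {forecast_probability: (observed_frequency, n_cases)}.
    """
    bins = {}
    for members, o in zip(ensemble_fcsts, obs):
        # Step 1: forecast probability = fraction of members with the event.
        k = sum(1 for m in members if m < threshold_fcst)
        p = k / len(members)
        # Step 2: record whether the event was actually observed.
        hits, n = bins.get(p, (0, 0))
        bins[p] = (hits + (1 if o < threshold_obs else 0), n + 1)
    return {p: (hits / n, n) for p, (hits, n) in sorted(bins.items())}
```

Plotting observed frequency against forecast probability gives the reliability curve, and the case counts correspond to the sample histograms shown beside each panel.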

B. Long-term LIS/CLSM Spinups and Open Loop 

As described above, CLSM is embedded in the LIS framework to create “hindcasts”, which are 
retrospective simulated forecasts, of hydrological conditions related to drought. Two domains
that experienced drought in 2011 are considered in this work: the Horn of Africa (HOA:
21.25°E-51.25°E; 11.25°S-23.75°N) and TEXas-MEXico (TEXMEX: 111.25°W-86.25°W;
18.75°N-41.25°N) regions, as shown in Figure 8. 



Figure 8. CLSM Catchments for the HOA (left) and TEXMEX (right) domains. 

However, in order to initialize these hindcasts, the land surface model must first be run for a 
long-term period in order for slowly evolving land surface states such as soil moisture profiles 
and water table depths to reach equilibrium conditions that are suitable for use in simulated 
forecast experiments. The process of running a land surface model for a long period of time 
using observed precipitation inputs is known as “spinup”, and requirements and recommended 
methodologies for spinup have been discussed by Cosgrove et al. (2003) and Rodell et al.
(2005), among others.

For this project, we utilized the same spinup methodology as MERRA-Land. First, we conducted
two 31-year spinups (1980-2011) using the MERRA-Land meteorological data as inputs or
“forcings” for LIS/CLSM; the system is restarted on 1 January 1980 when the end of 31
December 2011 is reached, using the 31 December 2011 land states. Once the 62-year spinup was
complete, we ran a final 31-year “open loop” run starting 1 January 1980, where the term “open
loop” refers to a LIS/CLSM run after spinup and without any data assimilation. This “open
loop” run is used to establish the LIS/CLSM climatology that is used both for data assimilation
and for calculation of drought-related indices.
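
The cycling logic of this procedure can be sketched as follows. This is only a schematic: a toy linear reservoir (a hypothetical stand-in, with an assumed drainage coefficient `k`) replaces LIS/CLSM, and the function names are invented for illustration. The point is the restart pattern, in which each cycle begins at the start of the forcing record with the land state left by the previous cycle.

```python
import numpy as np

def run_model(state, forcing, k):
    """Toy linear reservoir standing in for LIS/CLSM (hypothetical)."""
    trajectory = np.empty(len(forcing))
    for t, precip in enumerate(forcing):
        state = state + precip - k * state   # storage gains input, loses drainage
        trajectory[t] = state
    return state, trajectory

def spinup_then_open_loop(initial_state, forcing, n_spinup_cycles=2, k=0.01):
    """Cycle the full forcing record n_spinup_cycles times, then run once more."""
    state = initial_state
    for _ in range(n_spinup_cycles):
        # Restart at the beginning of the record using the final land state.
        state, _ = run_model(state, forcing, k)
    # Final pass from the spun-up state, with no data assimilation ("open loop").
    _, open_loop = run_model(state, forcing, k)
    return state, open_loop
```

In this toy setting, two runs started from very different initial states converge to nearly the same spun-up state, which is the property that spinup is intended to deliver for slowly evolving stores.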

Below, we describe the experimental setup for the spinup and open loop runs for both the HOA
and TEXMEX domains.

1. Experiment Setup 

The horizontal spatial grids for the Horn of Africa and TEXas-MEXico domains are both 0.25°,
which roughly corresponds to the 25 km resolution of the remotely sensed soil moisture data. In
addition to the catchments shown in Figure 8, the experiment setup requires soil, land
cover/vegetation and geological parameters.



Figure 9. Land cover type, soil type, May Leaf Area Index, and depth to bedrock for the HOA domain. 

In the case of the HOA domain, these input parameters are shown in Figure 9, and Figure 10
shows these parameters for TEXMEX. The CLSM parameters for the Fortuna2.5 version used in
MERRA-Land are described in Reichle et al. (2012), and include the following sources:

• Topography: GTOPO30. 

• Soils: NGDC soil textures at 1/12th-degree by 1/12th-degree.

• Landcover: IGBP classified using SiB-2 land cover classification data at 1-min.

• LAI: AVHRR climatology developed for GSWP-2 at 1-degree, monthly.

• Depth to bedrock: GSWP-2 data at 1-degree.

Based on recent work over the region by co-Is Rodell and Bolten, the depth to bedrock in our
simulations is set approximately 2 m deeper compared to that in MERRA-Land. This allows the
water table to properly evolve and incorporate GRACE data without hitting unrealistic bounds
during extreme wet or dry events, such as the 2011 drought. Due to the change in bedrock depth
along with the difference in spatial resolution for the LIS/CLSM runs relative to MERRA-Land,
a completely new CLSM parameter set was derived for this work using the GMAO preprocessor
for CLSM-Fortuna 2.5.

Figure 10. Same as Figure 9 but for TEXMEX domain and August instead of May LAI.


The LIS7/CLSM-f2.5 spinups were performed on a 1/4 degree horizontal resolution grid and at
20-minute time steps using the procedure described above. The MERRA-Land meteorological
forcing, which is hourly and has a resolution of 1/2 degree latitude x 2/3 degree longitude, was
bilinearly interpolated in space and time onto the 1/4 degree x 1/4 degree grid to perform the
LIS7/CLSM-f2.5 spinups and open loop run.
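
The spatial part of this regridding can be illustrated with a short NumPy sketch of bilinear interpolation between regular latitude-longitude grids. The function name and interface are assumptions for this example and do not reproduce the LIS implementation; temporal interpolation and the handling of missing data are omitted.

```python
import numpy as np

def bilinear_regrid(src_lat, src_lon, field, dst_lat, dst_lon):
    """Bilinearly interpolate field[lat, lon] from one regular grid to another.

    Coordinates must be ascending, and the target points must lie inside
    the source grid (this sketch does not guard against extrapolation).
    """
    src_lat, src_lon = np.asarray(src_lat, float), np.asarray(src_lon, float)
    dst_lat, dst_lon = np.asarray(dst_lat, float), np.asarray(dst_lon, float)
    field = np.asarray(field, float)

    # Lower row/column index of the source cell enclosing each target coordinate.
    i = np.clip(np.searchsorted(src_lat, dst_lat) - 1, 0, src_lat.size - 2)
    j = np.clip(np.searchsorted(src_lon, dst_lon) - 1, 0, src_lon.size - 2)

    # Fractional position of each target coordinate inside its source cell.
    wy = ((dst_lat - src_lat[i]) / (src_lat[i + 1] - src_lat[i]))[:, None]
    wx = ((dst_lon - src_lon[j]) / (src_lon[j + 1] - src_lon[j]))[None, :]

    # Values at the four corners of the enclosing cell.
    f00 = field[np.ix_(i, j)]
    f01 = field[np.ix_(i, j + 1)]
    f10 = field[np.ix_(i + 1, j)]
    f11 = field[np.ix_(i + 1, j + 1)]

    return ((1 - wy) * (1 - wx) * f00 + (1 - wy) * wx * f01
            + wy * (1 - wx) * f10 + wy * wx * f11)
```

For example, `src_lat` spaced at 1/2 degree and `src_lon` at 2/3 degree can be mapped onto 1/4 degree target coordinates; a field that varies linearly in latitude and longitude is reproduced exactly by this scheme.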

2. Evaluations 

In this section, we present our initial evaluation of the open loop runs over HOA. The objective
of the initial evaluation was to confirm that the simulations were working correctly prior to 
initiating the data assimilation runs. The full results of our evaluation of the HOA and TEXMEX 
open loop runs are given in subsequent sections so that they can be directly compared with the 
data assimilation results. 

For the HOA, the first step in our initial evaluation was to verify that the forcing data from 
MERRA-Land were properly read in and interpolated onto the 1/4 degree grid. Figure 11 shows
sample comparisons for precipitation, downward shortwave radiation, winds, and specific 
humidity. We also examined other forcing variables including longwave radiation and pressure, 
both as 2-d difference plots and time series plots. All these comparisons show only minor 
differences due to interpolation from the coarser MERRA-Land grid to the grid used in this 
project. Hence, we can conclude that the forcing processing for both the spinups and open loop 
runs worked properly. 

The second step in our initial evaluation included a comparison of open loop outputs to the 
MERRA-Land outputs. Figure 12 shows sample outputs for evapotranspiration, surface runoff, 
subsurface runoff, and surface soil moisture. As with the forcing comparisons, these outputs
suggest that there are only minor differences, due primarily to regridded forcing as well as
parameters. A particular parameter difference that we investigated was the impact of the 2 m
increase in bedrock depth in our simulations relative to MERRA-Land. To assess the impact of
this change, we calculated the error in the open loop Terrestrial Water Storage, which is the sum 
of groundwater, soil moisture and snow (if present), before and after lowering the bedrock depth. 
The error was calculated assuming GRACE TWS is truth. The results (not shown) indicated 
only small regions of significant change in error; hence, we concluded that the benefits of the
increase outweighed any costs.

Figure 11. Comparison of forcings from MERRA-Land with interpolated forcings used in the open loop run for HOA.
The forcings shown are (upper left) precipitation, (upper right) downward shortwave radiation, (lower left) wind speed,
and (lower right) specific humidity.

Figure 12. Comparison of MERRA-Land outputs and the equivalent outputs from our open loop run. The outputs shown
are (upper left) evapotranspiration, (upper right) surface runoff, (lower left) subsurface runoff (baseflow), and (lower
right) soil moisture.

C. Data Assimilation

1. AMSR-E Soil Moisture 

Whereas in-situ measurements of soil moisture are very accurate at the local scale, achieving 
accurate regional soil moisture estimates derived solely from point measurements is difficult 
given the need for a dense gauge network and for the proper upkeep of these instruments, which 
can be costly. Microwave remote sensing is the only technology capable of providing timely 
direct measurements of regional soil moisture in areas that are lacking in-situ networks, such as 
the Horn of Africa region. Soil moisture remote sensing technology is well established and has been
successfully applied in many ways to Earth science applications (Schmugge, 1985; Jackson
et al., 1982; Bolten et al., 2003). Soil moisture retrieval from microwave
measurements is possible because of the large contrast between the dielectric constants of dry
soil (~4) and water (~80). This contrast results in a broad range in the dielectric properties of
soil-water mixtures (4 - 40), and is the primary influence on the natural microwave emission
from the soil (Schmugge, 1985). Since the microwave emission from the soil surface is 
dependent upon the moisture content within the soil, when combined with physically-based 
models of the land surface via data assimilation, it is possible to derive accurate regional 
estimates of the soil column water content from microwave brightness temperature observed 
from satellite-based remote sensing instruments. In this project, we applied satellite-based 
estimates of soil moisture dynamics to improve the predictive capability of an optimized 
hydrologic model. 

The soil moisture retrieval algorithm used in this study is the Land Parameter Retrieval Model 
(LPRM) - a radiative transfer-based approach to derive global land surface moisture and 
vegetation optical depth from satellite observations of microwave brightness temperature (Owe 
et al., 2008; De Jeu and Owe, 2003; Meesters et al., 2005). Brightness temperature measured 
from space contains information on both the canopy and soil emissions and their respective 
physical temperatures. Polarization ratios, such as the Microwave Polarization Difference Index 
(MPDI), are frequently used to remove the temperature dependence, resulting in a parameter that
is more strongly and quantitatively related to the dielectric properties of the emitting surface(s).
The MPDI is mainly a function of the overlying vegetation, and consequently a good indicator of 
the canopy density. However, at 6.9 GHz, the MPDI will not only contain information on the 
canopy, but will also contain significantly more information on the soil emission and 
consequently the soil dielectric properties. This approach is based, in part, on the theoretical 
relationship between the MPDI, vegetation optical depth, and the soil dielectric constant, and is 
described in Owe et al. (2001) and Meesters et al. (2005). The latter reference describes an 
analytical solution to this relationship, which improves the accuracy and overall efficiency of the 
retrieval algorithm, while also allowing one to modify model parameters such as surface albedo. 
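
For reference, the polarization index discussed above is commonly written as follows. This is the usual textbook form and is given here as an assumption about notation: T_{b,V} and T_{b,H} denote the vertically and horizontally polarized brightness temperatures, e_V and e_H the corresponding emissivities, and T_s a single effective physical temperature of the emitting surface. Some authors include an additional factor of 1/2 in the denominator.

```latex
\mathrm{MPDI} = \frac{T_{b,V} - T_{b,H}}{T_{b,V} + T_{b,H}},
\qquad
T_{b,p} \approx e_p\, T_s
\;\;\Rightarrow\;\;
\mathrm{MPDI} \approx \frac{e_V - e_H}{e_V + e_H}
```

Under the simplifying assumption of one effective temperature for both polarizations, T_s cancels in the ratio, which is why the index is largely insensitive to physical temperature.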

The LPRM retrieval methodology subsequently uses a nonlinear iterative optimization procedure
in a forward modeling approach to partition the natural microwave emission from the Earth’s
surface into its primary source components, i.e. the soil surface and the vegetation canopy. The
model optimizes on the canopy optical depth and the soil dielectric constant. Once convergence
between the calculated and observed brightness temperatures is achieved, the model uses a
global database of soil physical properties together with a soil dielectric mixing model (Wang
and Schmugge, 1980) to solve for the surface soil moisture. No field observations of soil 
moisture, canopy biophysical properties, or other observations are used for calibration purposes, 
making the model largely physically-based and applicable at any microwave frequency suitable 
for soil moisture monitoring. Land surface temperature is also derived from high frequency 
satellite microwave measurements with a separate retrieval model. The application of the LPRM
to AMSR-E observations is ideal for the TEXMEX and HOA regions because both the 6.9 GHz
and 10.7 GHz channels can be used, and because of its demonstrated sensitivity to soil moisture
in the region (de Jeu et al., 2008).

We applied near-daily surface soil moisture estimates derived from the satellite-based Advanced
Microwave Scanning Radiometer for EOS (AMSR-E). The AMSR-E instrument on the NASA EOS Aqua
satellite provides global passive microwave measurements of terrestrial, oceanic, and 
atmospheric variables for the investigation of global water and energy cycles (Njoku et al. 2003). 
The satellite follows a sun-synchronous orbit with equatorial crossing at approximately 1330 
LST. The instrument measures brightness temperatures at six frequencies, 6.92, 10.65, 18.7,
23.8, 36.5, and 89.0 GHz, with vertical and horizontal polarizations at each frequency, for a total
of twelve channels. With a fixed Earth incidence angle of 54.8°
and an altitude of 705 km, AMSR-E provides a conically scanning footprint pattern with a swath
width of 1445 km. The mean footprint diameter ranges from 56 km at 6.92 GHz to 5 km at 89
GHz. AMSR-E revisit coverage is obtained nearly every two days at the equator,
independently for ascending and descending passes, and more frequently at higher latitudes. For
the East Africa and Texas regions, the descending and ascending overpasses occur at
approximately 0130 and 1330 local time, respectively.

In the Catchment Land Surface Model (CLSM; Koster et al., 2000), the vertical soil moisture
profile is determined through deviations from the equilibrium soil moisture profile between the 
surface and the water table. Soil moisture in the 0-2 cm surface layer and in the 0-100 cm root 
zone layer is diagnosed from the modeled soil moisture states. The catchment LSM typically 
employs hydrologically defined catchments (or watersheds) as basic computational units. In this 
study, however, the catchment LSM is used on a regular latitude-longitude grid to facilitate the 
model intercomparison. 

To integrate the AMSR-E observations with the NASA LIS CLSM, a data assimilation technique 
was used which applies auto-recursive analyses to optimally merge model estimates with state 
observations. The reduction in model uncertainty is achieved by taking advantage of model state 
temporal constancy restraints and model physical properties. Specifically, a 1-dimensional
Ensemble Kalman filter (EnKF) was applied. The EnKF is a nonlinear extension of the standard
Kalman filter and has been successfully applied to land surface forecasting problems (Reichle et
al., 2002). Within the filter, sequential ensembles of stochastically perturbed model trajectories
are corrected towards an observation of model state when available. All the forcing and state
perturbation parameters are given in Table 1. Our particular implementation of the EnKF
integrates soil moisture observations from AMSR-E with the CLSM at
20-minute model time-steps whenever AMSR-E observations are available. However,
before the AMSR-E soil moisture retrievals can be assimilated, the modeled and observed
(AMSR-E) data must be scaled to a common climatology to reduce potential biases and
differences in dynamic range that commonly exist between modeled and observed surface soil
moisture products. By removing time-invariant biases from the observation data, the two datasets
can be optimally merged to allow more efficient assimilation (Reichle et al., 2004). The removal
of multiplicative and additive errors in this way also provides an objective basis for the 
comparison of soil moisture anomalies and a basis for properly validating the system. 
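
To make the filter step concrete, the sketch below shows a generic perturbed-observation EnKF analysis for a single grid cell in which one element of the state vector is observed directly. It is a textbook form written for illustration, not the LIS code; the function name, the choice of observed element, and the use of perturbed observations are assumptions of this example.

```python
import numpy as np

def enkf_update(ensemble, obs, obs_err_std, obs_index=0, rng=None):
    """One EnKF analysis step for a single grid cell.

    ensemble: array (n_members, n_state) of forecast states.
    obs: scalar observation of state element obs_index (already rescaled).
    obs_err_std: assumed observation error standard deviation.
    """
    rng = np.random.default_rng() if rng is None else rng
    ensemble = np.asarray(ensemble, dtype=float)
    n_members = ensemble.shape[0]

    predicted_obs = ensemble[:, obs_index]
    state_anom = ensemble - ensemble.mean(axis=0)
    obs_anom = predicted_obs - predicted_obs.mean()

    # Sample covariances diagnosed from the ensemble.
    cov_state_obs = state_anom.T @ obs_anom / (n_members - 1)
    var_obs = obs_anom @ obs_anom / (n_members - 1)

    # Kalman gain: weight given to the observation relative to the model.
    gain = cov_state_obs / (var_obs + obs_err_std ** 2)

    # Each member is nudged toward its own perturbed copy of the observation.
    perturbed_obs = obs + rng.normal(0.0, obs_err_std, size=n_members)
    return ensemble + np.outer(perturbed_obs - predicted_obs, gain)
```

With a very small observation error the observed element of every member is drawn onto the observation, and with a very large one the ensemble is left essentially unchanged, which are the two limits of the weighting described above.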

Table 1. Parameters for perturbations to meteorological forcings and soil moisture prognostic model variables in the data
assimilation integrations.

Meteorological forcings
                                                           Cross correlations with perturbations
                                                           in meteorological forcings
Variable                     Perturbation     Standard     Downward     Downward     Precipitation
                             type             deviation    shortwave    longwave
Downward shortwave           Multiplicative   0.3 [-]       1.0         -0.5         -0.8
Downward longwave            Additive         50 W m^-2    -0.5          1.0          0.5
Precipitation                Multiplicative   0.5 [-]      -0.8          0.5          1.0

CLSM soil moisture states
                                                           Cross correlations with perturbations
                                                           in CLSM soil moisture states
Variable                     Perturbation     Standard     catdef       sfexc
                             type             deviation
Catchment deficit (catdef)   Additive         0.05 mm       1.0          0.0
Surface excess (sfexc)       Additive         0.02 mm       0.0          1.0
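
One way to draw forcing perturbations with the standard deviations and cross correlations of Table 1 is sketched below. Several choices here are assumptions rather than details given in this report: the Cholesky factorization of the correlation matrix, the treatment of multiplicative perturbations as lognormal factors with mean 1, and the function name. Spatial and temporal correlation of the perturbations is not represented.

```python
import numpy as np

# Order: downward shortwave, downward longwave, precipitation (Table 1).
CROSS_CORR = np.array([[1.0, -0.5, -0.8],
                       [-0.5, 1.0, 0.5],
                       [-0.8, 0.5, 1.0]])
SW_STD, LW_STD, PRECIP_STD = 0.3, 50.0, 0.5   # [-], W m^-2, [-]

def lognormal_factor(z, std):
    """Map standard normal z to a lognormal factor with mean 1 and given std."""
    sigma2 = np.log(1.0 + std ** 2)
    return np.exp(np.sqrt(sigma2) * z - 0.5 * sigma2)

def draw_forcing_perturbations(n_members, rng):
    """Return (sw_factor, lw_offset, precip_factor), each of length n_members."""
    chol = np.linalg.cholesky(CROSS_CORR)
    # Correlated standard normal deviates, one row per ensemble member.
    z = rng.standard_normal((n_members, 3)) @ chol.T
    sw_factor = lognormal_factor(z[:, 0], SW_STD)          # multiplicative
    lw_offset = LW_STD * z[:, 1]                           # additive, W m^-2
    precip_factor = lognormal_factor(z[:, 2], PRECIP_STD)  # multiplicative
    return sw_factor, lw_offset, precip_factor
```

In this sketch the cross correlations apply to the underlying normal deviates, so the correlations between the lognormal factors themselves differ slightly from the tabulated values.
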

The EnKF employed an ensemble size of 12, with perturbations applied to both the
meteorological fields and model prognostic fields to simulate uncertainty in the soil moisture
fields at 1/4 degree grid resolution. The parameters used for these perturbations are listed in Table
1 and are based on earlier data assimilation studies (Kumar et al., 2009). Because algorithms such
as the EnKF are designed to correct random, zero-mean errors and assume the use of unbiased
observations relative to the model-generated background, it is common practice to scale
the observations prior to data assimilation to match the model’s climatology (Reichle and Koster,
2004; Reichle et al., 2007; Kumar et al., 2009). Here we employed the Cumulative Distribution
Function (CDF)-scaling approach of Reichle and Koster (2004), where the observations are
rescaled to the model’s climatology by matching the CDF of the observations to the CDF of the
model soil moisture. The model and observation CDFs were computed using about 9 years of
data separately for each grid point, from June 2002 to December 2011.

In this way, the AMSR-E retrievals are transformed such that their climatology is comparable to
the climatology for top layer soil moisture estimates produced by the CLSM. The
climatologically rescaled AMSR-E data are then introduced as observations to the EnKF using
sequential observations of AMSR-E and climatological data. Each analysis was completed for the
TEXMEX (111.25°W-86.25°W; 18.75°N-41.25°N) and Horn of Africa (HOA; 21.25°E-51.25°E;
11.25°S-23.75°N) regions.
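
A minimal empirical version of CDF matching at one grid point might look like the following. It is a sketch under stated assumptions: empirical CDFs built from sorted samples with linear interpolation between them, and a hypothetical function name; the operational approach of Reichle and Koster (2004) may differ in how the CDFs are estimated and smoothed.

```python
import numpy as np

def cdf_match(obs_series, model_series, obs_values):
    """Rescale obs_values so their CDF matches the model climatology.

    obs_series, model_series: multi-year samples at one grid point that
    define the observation and model climatologies.
    obs_values: observations to rescale (for example, one day's retrieval).
    """
    obs_sorted = np.sort(np.asarray(obs_series, dtype=float))
    model_sorted = np.sort(np.asarray(model_series, dtype=float))

    # Empirical non-exceedance probabilities for each sorted sample.
    p_obs = (np.arange(obs_sorted.size) + 0.5) / obs_sorted.size
    p_model = (np.arange(model_sorted.size) + 0.5) / model_sorted.size

    # Observation value -> probability -> model value at that probability.
    p = np.interp(obs_values, obs_sorted, p_obs)
    return np.interp(p, p_model, model_sorted)
```

An observation at, say, the 10th percentile of the retrieval climatology is mapped to the 10th percentile of the model climatology, which removes differences in mean and dynamic range while preserving the ranking of wet and dry days.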

2. GRACE Terrestrial Water Storage 

The NASA/German Gravity Recovery and Climate Experiment (GRACE; Tapley et al., 2004) 
satellite mission measures month-to-month changes in Earth’s gravity field, which can be used to 
infer changes in terrestrial water storage (TWS; the sum of groundwater, soil moisture, snow, 
ice, and surface waters). Challenges to using GRACE TWS data include their coarse spatial and 
temporal resolutions, their vertically integrated nature, and the 2-5 month latency of the data
products (Rodell et al., 2010). There is a trade-off between spatial resolution and accuracy, such
that 150,000 km² is the approximate minimum area that can be resolved with a reasonable degree
of confidence, although 1° x 1° resolution gridded TWS anomaly (deviation from the temporal
mean) fields are distributed by NASA/JPL to facilitate delineation of a region of interest 
(Landerer and Swenson, 2012). 

To address these challenges, Zaitchik et al. (2008) developed an Ensemble Kalman smoother 
(EnKS) data assimilation scheme to integrate GRACE and other data within the Catchment Land 
Surface Model (CLSM). The scheme merges basin-scale, monthly GRACE-derived TWS
anomalies into CLSM using information on the uncertainty in both the observations and the 
model. It enables spatial and temporal downscaling by distributing the innovation (the difference 
between the observation and the model estimate of a state) among smaller spatial elements and 
time steps, and also acts to disaggregate the information vertically among water storage 
components. Results are fields of groundwater, soil moisture, and snow variations, which 
combine the veracity of GRACE with the fine spatial and temporal resolutions of the model.
Zaitchik et al. (2008) and Houborg et al. (2012) demonstrated that groundwater storage
variability in several regions of the U.S. was more accurate in the GRACE assimilation than in 
the CLSM open loop results. 

In an ensemble assimilation approach, conditional probability densities for predicted states are 
approximated by a finite number of model trajectories - the ensemble - with a covariance that 
reflects uncertainties in the model physics, parameters, and forcing data. Assimilation 
increments are calculated based on the relative uncertainty in the model and the observations, 
described by the (sample) error covariance matrices. The time-dependent Kalman gain matrix 
determines the relative weights of the model versus the observations during the update, and is 
defined on the basis of their respective covariance matrices. The error cross-covariance between 
the observed state and the model prediction is particularly important because it provides the basis 
for the distribution of information from the coarse scale observation to the finer scale model tiles. 
Since the cross-covariance matrix is diagnosed from the ensemble, the perturbations that are 
added to the forcings and state variables of each ensemble member must include realistic 
horizontal correlations. The ensemble update is computed separately for each GRACE 
observation. Prior to assimilation, the GRACE TWS anomalies are converted to absolute TWS 
values by adding the corresponding regional, time-mean water storage from the open loop 
portion of the CLSM simulation. 
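
In generic notation, the steps described in this paragraph can be summarized as below. This is the standard ensemble Kalman update written for illustration, not the specific smoother formulation of Zaitchik et al. (2008). The symbols are assumptions of this sketch: y' is a GRACE TWS anomaly, \overline{\mathrm{TWS}}_{\mathrm{OL}} the time-mean open loop water storage for the same region, x_i^f and x_i^a the forecast and analysis states of ensemble member i, \hat{y}_i^f the model-predicted TWS for that member, C_{xy} the sample cross-covariance between the states and predicted TWS, C_{yy} the sample variance of predicted TWS, R the observation error variance, and \varepsilon_i a random draw with variance R.

```latex
\begin{aligned}
y &= y' + \overline{\mathrm{TWS}}_{\mathrm{OL}} \\
K &= C_{xy}\,\left(C_{yy} + R\right)^{-1} \\
x_i^{a} &= x_i^{f} + K\left(y + \varepsilon_i - \hat{y}_i^{f}\right),
\qquad i = 1,\dots,N
\end{aligned}
```

Because C_{xy} links the coarse-scale predicted TWS to each fine-scale model state, the gain K is what distributes a single GRACE increment among tiles and storage components.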

For the current project, the GRACE data assimilation approach developed by Zaitchik et al. 
(2008) was modified to enable assimilation of the 1° x 1° resolution gridded TWS anomaly fields 
directly, without first averaging them over large river basins. While each 1° pixel is not 
meaningful on its own (hence the basin averaging), there is useful spatial information in the 
GRACE fields at sub-basin scales. By carefully setting the observational error within the data 
assimilation scheme, we are able to preserve that spatial information while avoiding errors 
caused by overconfidence in the pixel scale GRACE data. 

In the context of this project, the purpose of GRACE data assimilation is to improve the 
representation of soil moisture and groundwater in the model. These variables feed back to
atmospheric processes; hence, more accurate initialization of these fields in a coupled
land-atmosphere forecast system should, all else being equal, increase forecast skill.

3. Evaluations 

In this section, we evaluate the results from the Open Loop (OL), Soil Moisture Data 
Assimilation (DA-SM) and GRACE Terrestrial Water Storage Data Assimilation (DA-TWS) 
runs. Due to the lack of in situ data over HOA, the evaluation over that region focuses on 
remotely sensed evapotranspiration estimates, while over the TEXMEX domain we include both 
in situ and remotely sensed data in the evaluation. Note that these simulations do not represent 
forecasts; our evaluations of forecasted soil moisture come later.

Figure 13. In situ soil moisture observations in the TEXMEX domain. The SCAN stations measure profiles while the
ARS CalVal stations include only surface measurements in 2 experimental watersheds.

(a) TEXMEX 

The soil moisture estimates from the simulations are compared against two reference datasets: 
(1) surface soil moisture measurements from four USDA Agricultural Research Service (ARS)
experimental watersheds (Jackson et al. 2010) (two of which are in the modeled domain) and (2) 
soil profile measurements from the USDA Soil Climate Analysis Network (SCAN; Schaefer et 
al. 2007). The stations in the SCAN network provide hourly soil moisture measurements at the 
soil profile depths of 5, 10, 20, 50 and 100 cm, wherever possible. A number of extensive quality 
control procedures were applied to the raw data from the SCAN sites, the details of which are 
described in Liu et al. (2011). We employ this quality-controlled dataset in our evaluations.
Figure 13 shows the locations of the ARS and SCAN stations employed in the evaluations. These 
sites reflect the locations which passed the quality control of the in-situ data and where adequate 
soil moisture observations were assimilated. 

Table 2 shows the comparison of the domain averaged anomaly correlation (R), anomaly root 
mean square error (RMSE) and unbiased RMSE metrics for the open loop and the DA 
integrations compared to the ARS and SCAN site data. The anomaly time series for each grid 
point is estimated by subtracting the monthly-mean climatology values from the daily average 
raw data, so that the anomalies represent the daily deviations from the mean seasonal cycle. (We 
thus do not measure the trivial skill associated with precipitation and evaporation seasonality.) 
The anomaly R and RMSE values are computed separately at each grid point as the correlation
coefficient and RMSE between the daily anomalies from the assimilation estimates and the
corresponding in-situ data, respectively. As the anomaly metrics are indifferent to any bias in the
mean or amplitude of variations, the “unbiased” RMSE (ubRMSE), which is computed from the
time series after the removal of the long-term mean bias (Entekhabi et al., 2010), is also used as
an evaluation metric. Due to the availability of SCAN and ARS datasets, the period from January
2003 to December 2011 is used to compute these error metrics. In the evaluations, surface soil
moisture is defined as the top 10 cm of the soil column and the root zone is defined as the soil
moisture content of the top 1 m of the soil column (derived as a suitably weighted vertical
average over the model and observation layers).
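
The three metrics can be computed at a single station or grid point along the following lines. The function names and the handling of missing values are assumptions of this sketch; the anomalies are formed as described above, by subtracting the mean of each calendar month from the daily values.

```python
import numpy as np

def daily_anomalies(values, months):
    """Subtract the monthly-mean climatology (per calendar month) from daily data."""
    values = np.asarray(values, dtype=float)
    months = np.asarray(months)
    anomalies = np.empty_like(values)
    for m in np.unique(months):
        sel = months == m
        anomalies[sel] = values[sel] - values[sel].mean()
    return anomalies

def evaluation_metrics(model, insitu, months):
    """Return (anomaly R, anomaly RMSE, unbiased RMSE) for one location."""
    model = np.asarray(model, dtype=float)
    insitu = np.asarray(insitu, dtype=float)
    months = np.asarray(months)

    # Use only days on which both the model and the measurement are available.
    valid = np.isfinite(model) & np.isfinite(insitu)
    model, insitu, months = model[valid], insitu[valid], months[valid]

    a_model = daily_anomalies(model, months)
    a_insitu = daily_anomalies(insitu, months)
    anomaly_r = np.corrcoef(a_model, a_insitu)[0, 1]
    anomaly_rmse = np.sqrt(np.mean((a_model - a_insitu) ** 2))

    # Unbiased RMSE: remove only the long-term mean difference from the raw series.
    diff = model - insitu
    ubrmse = np.sqrt(np.mean((diff - diff.mean()) ** 2))
    return anomaly_r, anomaly_rmse, ubrmse
```

A model series that differs from the measurements only by a constant offset scores an anomaly R of 1 and zero for both RMSE measures, which illustrates that none of these metrics penalizes a mean bias.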


Table 2. Statistics of modeled soil moisture compared to in situ measurements at the SCAN and ARS sites. The model
results shown include Open Loop (OL), Soil Moisture Data Assimilation (DA-SM) and GRACE Data Assimilation
(DA-TWS).

Surface (SCAN)       OL                 DA-SM              DA-TWS
Anomaly R            0.61 +/- 0.02      0.62 +/- 0.02      0.64 +/- 0.02
Anomaly RMSE         0.049 +/- 0.002    0.048 +/- 0.002    0.048 +/- 0.002
Unbiased RMSE        0.059 +/- 0.002    0.058 +/- 0.002    0.058 +/- 0.002

Root zone (SCAN)     OL                 DA-SM              DA-TWS
Anomaly R            0.59 +/- 0.02      0.61 +/- 0.02      0.63 +/- 0.02
Anomaly RMSE         0.035 +/- 0.002    0.035 +/- 0.002    0.034 +/- 0.002
Unbiased RMSE        0.044 +/- 0.002    0.043 +/- 0.002    0.041 +/- 0.002

Surface (ARS)        OL                 DA-SM              DA-TWS
Anomaly R            0.77 +/- 0.02      0.75 +/- 0.02      0.77 +/- 0.02
Anomaly RMSE         0.025 +/- 0.002    0.027 +/- 0.002    0.025 +/- 0.002
Unbiased RMSE        0.030 +/- 0.002    0.031 +/- 0.002    0.030 +/- 0.002

Table 2 indicates that both soil moisture and TWS assimilation help improve the soil
moisture states. When compared to the SCAN measurements, the domain-averaged open loop
anomaly R of 0.61 improves marginally (not statistically significant) to 0.62
with soil moisture DA. The TWS assimilation shows a larger and statistically significant improvement in
the surface soil moisture fields, with an anomaly R of 0.64. In the case of root zone soil moisture
comparisons, the trends are similar, with soil moisture DA and TWS DA providing domain-averaged
anomaly R values of 0.61 and 0.63, respectively, compared to the open loop anomaly R
value of 0.59. Similar trends of improvement are observed with the anomaly RMSE and
unbiased RMSE metrics. In the comparisons to ARS data, the soil moisture DA shows a
marginal degradation (in anomaly R) and TWS DA does not show any improvement. Note that
only two watersheds from the ARS network are included in this evaluation, whereas the SCAN
evaluation includes 55 stations.

The impact of data assimilation on the surface fluxes of latent and sensible heat is evaluated
against four independent flux data products: (1) gridded FLUXNET data from the Max Planck
Institute (MPI), which was created by synthesizing FLUXNET tower data with meteorological
forcings and vegetation information from interpolated station and satellite data to produce a
global, monthly, 0.5 degree resolution data product from 1982 to 2008 (Jung et al. 2009); (2) a
global 1 km ET estimate based on MODIS observations (Mu et al. 2011); (3) the Atmosphere-Land
Exchange Inversion model (ALEXI) based flux products (Anderson et al. 2007), which are
based on the thermal channel observations of the Geostationary Operational Environmental
Satellites (GOES); and (4) a MODIS-based evapotranspiration product from the University of
Washington (UW) (Tang et al. 2009).

Figure 14 shows the domain averaged difference maps of RMSE from the soil moisture and 
TWS DA integrations from the evaluations against the four reference datasets. The difference 
maps are computed by subtracting the RMSE of the DA integration from the RMSE of the open 
loop integration. A positive difference means that the DA integration improves on the open loop
estimates, whereas a negative difference means that the DA integration degrades the
open loop estimates.

Overall, degradations in the western part of the modeling domain are observed in the soil 
moisture DA comparisons. Most of the improvements from soil moisture DA are observed over 
the upper Mississippi basin. The changes in the ET fields due to TWS DA, on the other hand, are
generally positive, with improvements noted in Mexico and the lower Mississippi basin. Figure 15
shows a similar comparison of the sensible heat fluxes from DA integrations compared against 
the FLUXNET and ALEXI datasets. Generally both DA integrations provide improvements in 
sensible heat fluxes, especially over the Mississippi basin areas.

Figure 14: RMSE difference of latent heat flux estimates from soil moisture and TWS DA integrations compared against
four reference datasets. The RMSE difference is computed as the RMSE of the open loop integration minus the RMSE of
the DA integration. The blue (negative) colors indicate areas with degradation from DA and red (positive) colors indicate
areas with improvements from DA.

Figure 15: RMSE difference of sensible heat flux estimates from soil moisture and TWS DA integrations compared
against the FLUXNET and ALEXI reference datasets. The RMSE difference is computed as the RMSE of the open loop integration minus the
RMSE of the DA integration. The blue (negative) colors indicate areas with degradation from DA and red (positive)
colors indicate areas with improvements from DA.

We conclude the evaluation of drought simulation in the TEXMEX region with a comparison 
against maps generated by the North American Drought Monitor (NADM). The plots on the left 
in Figure 16 show the percentiles of root zone soil moisture produced by the OL simulation 
during different months of 2011. (Percentiles are based on the monthly states produced over the
full 31 years of simulation.) The plots on the right come directly from the published NADM.
The color bar used for the OL percentiles was chosen to agree with the “drought severity
classification” utilized by the drought monitor (http://droughtmonitor.unl.edu/classify.htm).
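
The percentile calculation can be sketched as follows for one grid cell and one calendar month. The percentile definition (the share of climatological values at or below the current value) and the class boundaries are assumptions of this example: the boundaries follow the commonly published drought monitor categories (D4: 0-2, D3: 3-5, D2: 6-10, D1: 11-20, D0: 21-30) and should be checked against the classification page cited above.

```python
import numpy as np

# (upper percentile bound, class label); assumed class boundaries, see lead-in.
DROUGHT_CLASSES = [(2.0, "D4"), (5.0, "D3"), (10.0, "D2"), (20.0, "D1"), (30.0, "D0")]

def soil_moisture_percentile(climatology, value):
    """Percentile of value within the same-calendar-month climatology.

    climatology: root zone soil moisture for this month in every simulated
    year (31 values for a 31-year open loop run).
    """
    climatology = np.asarray(climatology, dtype=float)
    return 100.0 * np.mean(climatology <= value)

def drought_class(percentile):
    """Map a percentile to a drought class label, or 'none' above the 30th."""
    for upper_bound, label in DROUGHT_CLASSES:
        if percentile <= upper_bound:
            return label
    return "none"
```

With 31 years of data the driest year on record sits at roughly the 3rd percentile under this definition, so the finest classes are only coarsely resolved by a climatology of this length.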

The salient result in the plot is the generally successful tracking of the NADM-defined drought, 
through its growth to its peak and decay, by the soil moisture percentiles produced with the LIS 
system. The agreement is indeed remarkable given that the NADM characterization of the 
drought is based on a number of indices (a subjective integration of precipitation, streamflow, 
reports from the field, etc.), whereas the OL root zone moisture percentiles shown in the figure 
represent a straightforward and fully objective calculation. The agreement gives us confidence 
that we capture the essential character of drought with our modeling system.

Figure 16. Percentiles of root zone soil moisture as produced by the OL simulation (right column) versus drought
estimates published by the North American Drought Monitor (left column) for different phases of the 2011 TEXMEX
drought.


The July 2011 root zone moisture percentiles produced in the OL simulations are compared to 
those produced in the DA-TWS simulation in Figure 17. While some small differences are seen, 
particularly in the neighborhood of Mississippi, the assimilation of GRACE TWS data generally
has a small impact on estimated drought.

[Figure 17 panels: July 2011 soil moisture percentile, OL simulation; July 2011 soil moisture percentile, DA-TWS simulation]

Figure 17: Impact of data assimilation (in particular, the assimilation of TWS estimates from GRACE) on the computed 
estimate of root zone moisture percentiles for July 2011. 


(b) HOA 

Due to the lack of adequate ancillary measurements over HOA, the evaluations are limited to 
comparisons against FLUXNET, MOD16, NDVI, and FEWSNET-based datasets.

Figure 18: RMSE difference of latent heat flux estimates from soil moisture and TWS DA integrations compared against
two reference datasets. The RMSE difference is computed as the RMSE of the open loop integration minus the RMSE of 
the DA integration. The blue (negative) colors indicate areas with degradation from DA and red (positive) colors indicate 
areas with improvements from DA. 


Similar to the results shown over the TEXMEX domain, the evaluation of the ET fields given in 
Figure 18 indicates degradations from soil moisture DA over several parts of the domain, with a 
few areas of improvement such as the Horn and areas near Uganda and Tanzania. The TWS
improvements are mostly over the wetland region in Sudan known as the Sudd.

Figure 19: RMSE difference of sensible heat flux estimates from soil moisture and TWS DA integrations compared
against FLUXNET data. The RMSE difference is computed as the RMSE of the open loop integration minus the RMSE 
of the DA integration. The blue (negative) colors indicate areas with degradation from DA and red (positive) colors 
indicate areas with improvements from DA. 


The evaluation of the sensible heat fluxes from DA integrations against the FLUXNET data is 
shown in Figure 19. In most parts of the domain, soil moisture DA provides improvements in 
the sensible heat flux fields with the TWS DA again providing improvements over the Sudd. 
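
The RMSE-difference metric defined in the captions of Figures 18 and 19 can be sketched as follows. This is a minimal illustration with synthetic numbers, not the evaluation code used in this study; the function names `rmse` and `rmse_difference` and the array layout (time along the first axis, grid cells along the second) are assumptions made for the example.

```python
import numpy as np

def rmse(estimate, reference):
    """Root-mean-square error along the time axis (axis 0), ignoring NaNs."""
    return np.sqrt(np.nanmean((estimate - reference) ** 2, axis=0))

def rmse_difference(open_loop, assimilated, reference):
    """RMSE(OL) minus RMSE(DA): positive where DA improves on the open loop."""
    return rmse(open_loop, reference) - rmse(assimilated, reference)

# Synthetic heat fluxes (W m-2): 4 time steps, 2 grid cells (illustrative only)
reference = np.array([[100.0, 50.0], [110.0, 55.0], [120.0, 60.0], [130.0, 65.0]])
open_loop = reference + np.array([10.0, 2.0])    # OL bias: +10 in cell 0, +2 in cell 1
assimilated = reference + np.array([4.0, 5.0])   # DA bias: +4 in cell 0, +5 in cell 1

diff = rmse_difference(open_loop, assimilated, reference)
print(diff)  # positive in cell 0 (improvement), negative in cell 1 (degradation)
```

With this sign convention, a positive value corresponds to the red (improvement) colors in the figures and a negative value to the blue (degradation) colors.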

Note that the reference ET datasets themselves have associated uncertainties, and our 
independent analysis (not shown) of the products showed that MOD16 systematically 
underestimates fluxes and likely has higher uncertainties associated with it. The FLUXNET 
product, on the other hand, is developed from sparse tower network data. 


In analogy to Figures 16 and 17 for the TEXMEX region, the root zone soil moisture percentiles 
for the HOA region are presented in Figure 20. The first column shows the percentiles for April, 
the peak of the HOA drought. Africa does not have a useful equivalent of the NADM, so to 
evaluate these patterns, we turn to an NDVI map for May (second column of Figure 20). A 
negative NDVI anomaly is indicative of reduced vegetation coverage and/or lushness, which in 
turn serves as a useful indicator of low soil moisture in the weeks leading up to the measurement. 
The large negative anomaly inside the blue circle generally agrees with the low soil moisture 
percentiles produced there by both the OL and DA-TWS simulations in April. Note that 
simulated soil moisture percentiles are especially affected by GRACE data assimilation in the 
area highlighted by the green circle. 

The third column of Figure 20 shows the simulated percentiles of root zone moisture in the HOA 
region for the April-June period, the peak of the growing season in Somalia. Results are shown 
only for the subset of the region for which crop water requirement satisfaction index (WRSI) 
data are available from the Famine Early Warning Systems Network (FEWS-NET) 
(fourth column). The crop WRSI represents the amount of soil water available during the 
growing season of the major crop (in this case, maize); an index 
value less than 50 represents crop failure. The model simulations accurately capture the failure of 
crops in the area enclosed by the magenta circle. The simulations also show dryness toward the west, 
where crops were successful; perhaps this is because agriculture toward the west (unlike in 
Somalia) is dominated by production in other seasons. 
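
As an illustration of what a soil moisture percentile expresses, the sketch below ranks one month's value against a record of values for the same calendar month. It is a simplified assumption about the calculation, not the procedure used to produce the figures; the function name `empirical_percentile` and the synthetic 20-year record are hypothetical.

```python
import numpy as np

def empirical_percentile(value, climatology):
    """Percent of climatological values that are less than or equal to `value`."""
    climatology = np.asarray(climatology, dtype=float)
    return 100.0 * np.count_nonzero(climatology <= value) / climatology.size

# Hypothetical April root zone soil moisture (m3 m-3) for 20 past years
april_history = np.linspace(0.15, 0.34, 20)

# A value drier than all but two of the historical Aprils
dry_april = empirical_percentile(0.165, april_history)
print(dry_april)
```

A low percentile means that the month is among the driest in the record for that time of year, which is why percentiles rather than absolute soil moisture values are mapped.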


[Figure 20 panel titles: April 2011 SM percentile, OL; April 2011 SM percentile, DA-TWS; NDVI anomaly (veg. health), early May 2011 (USGS/EROS); growing season (AMJ) SM percentile, OL (regions with no obs masked out); growing season (AMJ) SM percentile, DA-TWS; crop WRSI, grains, Nov. 2011. Percentile color scale ticks: 0, 3, 6, 11, 21, 31. WRSI legend: <50 Failure; 50-59 Poor; 60-79 Mediocre; 80-94 Average; 100 Very Good.] 

Figure 20: First column: root zone soil moisture percentiles for April as produced in the OL simulation (top) and the DA- 
TWS simulation (bottom). Second column: NDVI anomaly for the first part of May. Third column: root zone soil 
moisture percentiles for the April-June period as produced in the OL simulation (top) and the DA-TWS simulation 
(bottom). Fourth column: end-of-year estimates of crop failure in the HOA region, as estimated by the FEWS-NET 
WRSI model. For ease of comparison, masks are applied in the third column to mimic the regions of data availability 
shown in the fourth column. The circled areas are discussed in the text. 

D. GEOS-5 Forecast Experiments 


1. Experiment Setup 

The HOA and TEXMEX regions exhibited severe drought conditions in 2011. The 
following three sets of LIS7 experiments were performed to examine the drought prediction skill 
obtained by utilizing different land surface initial conditions with GEOS-5 forecasts: (i) LIS7 
forecasts initialized with the open loop runs described in Section II.B (OL); (ii) LIS7 forecasts 
initialized after assimilating soil moisture retrievals from the Advanced Microwave Scanning 
Radiometer - Earth Observing System (AMSR-E; Owe et al. 2008) Land Parameter Retrieval 
Model (LPRM) for 2002-2011 (DA-SM, described above); and (iii) LIS7 forecasts initialized 
after performing terrestrial water storage data assimilation based on the Gravity Recovery 
and Climate Experiment (GRACE; Houborg et al. 2012) for 2003-2011 (DA-TWS, described 
above). 

Based on our assessment of GEOS-5 prediction skill, we determined that one to three months 
is the longest lead time at which we would expect reliable drought forecasts. Therefore, 
one- to three-month forecasts were performed in all the experiments so that each month of 
2011 was targeted at lead times of up to 3 months. For example, to have January 2011 as the 
third-month forecast target, the experiments were initialized at the beginning of November 2010. 
Figure 21 gives an overview of the forecast months during 2010 and 2011 in each experiment. 
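
The relationship between a target month, its lead, and the initialization date can be sketched as below. This is an illustrative calculation only; the helper names `shift_month` and `initialization_dates` are hypothetical, and the convention that lead 1 denotes the first forecast month (so the forecast is initialized at the start of the target month itself) is inferred from the January 2011 example above.

```python
from datetime import date

def shift_month(d, months):
    """Return the first day of the month that is `months` away from date `d`."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def initialization_dates(target, max_lead=3):
    """Map lead (in months) to the initialization date for a given target month.

    Lead 1 means the target is the first forecast month, so the forecast is
    initialized at the beginning of the target month itself.
    """
    return {lead: shift_month(target, -(lead - 1)) for lead in range(1, max_lead + 1)}

# January 2011 as the target month
schedule = initialization_dates(date(2011, 1, 1))
for lead, init in schedule.items():
    print(f"lead {lead}: initialize at the beginning of {init:%Y-%m}")
```

Under this convention, targeting every month of 2011 at leads of one to three months requires initializations from November 2010 through December 2011.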

(a) Meteorological forcing data 

The meteorological forcing data used to conduct the LIS7/CLSM-f2.5 retrospective forecasts, at 
lead times of up to 9 months, were extracted from GEOS-5 forecasts. The meteorological data used to force the 
land component of GEOS-5 (CLSM) during the production of forecasts were archived at 
1.25° longitude x 1° latitude resolution and at half-hourly frequency (Table 3). The total 
precipitation forcing fields from the GEOS-5 forecasts were found to differ from (and were mostly 
higher than) the MERRA-Land precipitation in all seasons. To correct this bias, a new precipitation 
forcing was estimated by CDF matching (Reichle and Koster, 2004), wherein the precipitation 
climatology of each ensemble member of the GEOS-5 forecasts is rescaled to match the climatology 
of MERRA-Land precipitation. 
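
The idea behind CDF matching can be sketched with an empirical quantile mapping: each forecast value is converted to its non-exceedance probability under the forecast climatology and then mapped to the reference value with the same probability. The code below is a generic sketch under that assumption, not the implementation used for the GEOS-5 forcing; the function name `cdf_match`, the plotting-position formula, and the gamma-distributed synthetic samples are all choices made for illustration.

```python
import numpy as np

def cdf_match(values, model_clim, reference_clim):
    """Rescale `values` so the model climatology's CDF maps onto the reference CDF."""
    model_sorted = np.sort(np.asarray(model_clim, dtype=float))
    reference_sorted = np.sort(np.asarray(reference_clim, dtype=float))
    # Non-exceedance probability attached to each sorted sample
    model_prob = (np.arange(model_sorted.size) + 0.5) / model_sorted.size
    reference_prob = (np.arange(reference_sorted.size) + 0.5) / reference_sorted.size
    # Step 1: value -> probability under the model climatology
    prob = np.interp(values, model_sorted, model_prob)
    # Step 2: probability -> value under the reference climatology
    return np.interp(prob, reference_prob, reference_sorted)

rng = np.random.default_rng(0)
reference_clim = rng.gamma(shape=2.0, scale=2.0, size=5000)     # stand-in for the reference product
model_clim = 1.5 * rng.gamma(shape=2.0, scale=2.0, size=5000)   # stand-in for a wet-biased member

corrected = cdf_match(model_clim, model_clim, reference_clim)
print(model_clim.mean(), corrected.mean(), reference_clim.mean())
```

Real precipitation series contain many zeros, which produce ties in the sorted climatology; an operational implementation would need to handle those explicitly, and this sketch does not.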

(b) LIS7 experiments 

The meteorological forcing from GEOS-5, archived at a resolution of 1.25° x 1° and 30-min 
frequency, is bilinearly interpolated in space and time to 0.25° and 15 min, respectively, to 
perform the LIS7/CLSM-f2.5 experiments. The LIS7/CLSM-f2.5 experiments (OL, DA-SM, and 
DA-TWS) differ from each other only in their initial conditions. The initial conditions for the 
‘OL’ forecasts were extracted from the LIS7/CLSM-f2.5 OL run covering 1981 to 
2011. As previously described, the initial conditions for the OL run were produced with a 62-year 
spin-up of the LIS7/CLSM-f2.5 system. The initial conditions for the DA-SM and DA-TWS 
forecasts were obtained from experiments that assimilated AMSR-E/LPRM 
soil moisture and GRACE TWS, respectively, into the OL run. 
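
The spatial part of this regridding step can be illustrated with a separable linear interpolation, which is equivalent to bilinear interpolation on a regular latitude-longitude grid. This is a sketch only and is not LIS code; the function name `bilinear_regrid`, the grid extents, and the synthetic temperature-like field are hypothetical.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate field[lat, lon] from a coarse regular grid to a finer one."""
    # Interpolate along longitude for every source latitude row ...
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ... then along latitude for every destination longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

# Coarse grid: 1 degree in latitude, 1.25 degrees in longitude (arbitrary extent)
src_lat = np.arange(0.0, 5.0, 1.0)
src_lon = np.arange(30.0, 36.25, 1.25)
# Fine grid: 0.25 degrees in both directions, inside the coarse grid
dst_lat = np.arange(0.0, 4.25, 0.25)
dst_lon = np.arange(30.0, 35.25, 0.25)

# Synthetic field that varies linearly in both directions
coarse = 290.0 + 2.0 * src_lat[:, None] + 3.0 * src_lon[None, :]
fine = bilinear_regrid(coarse, src_lat, src_lon, dst_lat, dst_lon)
print(coarse.shape, "->", fine.shape)
```

The same `np.interp` call applied along a time axis would turn 30-min values into 15-min values by linear interpolation.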


[Figure 21 timeline labels: 2010 NOV, 2010 DEC, 2011 JAN, 2011 FEB, 2011 MAR, 2011 APR, 2011 MAY, 2011 JUN] 

Figure 21. The red solid arrows show the 3-month forecasts initialized at the month indicated at the beginning of each 
arrow. The initialization months and forecast lengths are chosen so that each month of 2011 is targeted at lead times of up to 
3 months. 

Note that the catchment parameters used in the LIS7/CLSM-f2.5 experiments (OL, DA-SM and 
DA-TWS) differ from those used in the GEOS-5/CLSM forecasts. The LIS7/CLSM and 
GEOS-5/CLSM forecasts differ in the following ways: 

(1) The horizontal spatial resolution of the domain for LIS7/CLSM-f2.5 is 0.25°, whereas the 
GEOS-5/CLSM-f2.5 simulations were performed at 1.25° x 1° horizontal spatial 
resolution. 

(2) The time step used in the LIS7/CLSM-f2.5 simulations is 15 minutes, whereas a 30- 
minute time step was used in the GEOS-5/CLSM-f2.5 simulations. 

(3) The turbulence scheme used in LIS7 is the Louis scheme (Louis 1979), whereas GEOS-5 
used the Helfand scheme (Helfand and Labraga 1988; Helfand et al. 1999). 

(4) CLSM-f2.5 simulations were performed on tile space in GEOS-5 and on grid space in 
LIS7. 

(5) The bedrock depths are deeper by 2 m in the LIS7/CLSM-f2.5 experiments. Some 
catchment parameters (e.g., total water holding capacity) are altered by this change. 

Table 3. The meteorological forcing data from GEOS5 used in LIS7 simulations. 

GEOS5 forcing data 

1. Near Surface Air Temperature (K) 
2. Near Surface Specific Humidity (kg kg⁻¹) 
3. Incident Shortwave Radiation (W m⁻²) 
4. Incident Longwave Radiation (W m⁻²) 
5. Eastward Wind (m s⁻¹) 
6. Northward Wind (m s⁻¹) 
7. Surface Pressure (Pa) 
8. Rainfall Rate (kg m⁻² s⁻¹) 
9. Snowfall Rate (kg m⁻² s⁻¹) 
10. Convective Rainfall Rate (kg m⁻² s⁻¹) 
11. Height of Forcing Variables (m) 
12. Photosynthetically Active Direct Radiation (W m⁻²) 
13. Photosynthetically Active Diffuse Radiation (W m⁻²) 
14. Net Shortwave Radiation at the Surface (W m⁻²) 


2. Evaluations 


(a) Soil Moisture 

Spatial maps of the LIS7/CLSM root zone soil moisture forecasts in the HOA region for the peak 
of the drought period are presented in Figure 22 (first three columns, each column representing a 
different forecast start date). The percentiles shown can be compared to those from the OL 
simulation (fourth column) and to the aforementioned NDVI data (fifth column). The forecast 
system does predict dry conditions for April in many parts of the HOA region, even when 
initialized at the beginning of February. Only the forecast initialized at the beginning of April, 
however, captures reasonably well the precise locations of many of the anomalies seen in the OL 
simulation and in the NDVI data. 

While an improvement in forecast skill with a reduction in lead time is not a surprise, it is also 
worth emphasizing here the nature of an ensemble forecast. The forecast results in Figure 22 
represent an average over seven separate soil moisture forecasts (based on an ensemble of seven 
meteorological forecasts from GEOS-5), each reflecting a possible trajectory of soil moisture 
evolution over the HOA region. Nature, in contrast, provides only one trajectory. Shown in 
Figure 23 are seven soil moisture forecasts for April, each one initialized at the beginning of 
March. The two forecast maps with the overlain blue circles indicate the key point: for the 
March 1 initialization, some of the ensemble members capture the dryness seen in the 
observations, whereas others do not. Taking the ensemble mean “washes out” the extreme 
signals produced by some ensemble members. While this is an unavoidable facet of ensemble 
forecasting, because we have no way of knowing a priori which ensemble member will be the 
most accurate, the existence of the desired signal among the ensemble members should be kept in mind. 
(Note: in all of the plots featuring ensemble averages, the percentiles are defined based on a 
ranking of the ensemble mean forecasts; thus, dry extremes can show up for the ensemble mean 
even given the averaging process.) 
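
Both points, the damping of extremes by averaging and the ranking of ensemble means against other ensemble means, can be illustrated with a few numbers. All values below are invented for the example and are not taken from the forecasts; the variable names are hypothetical.

```python
import numpy as np

# Hypothetical April root zone soil moisture anomalies (m3 m-3) at one
# location from seven ensemble members; two members are strongly dry.
members = np.array([-0.08, -0.07, -0.01, 0.00, 0.01, 0.02, 0.03])

ensemble_mean = members.mean()
driest_member = members.min()
print(f"ensemble mean: {ensemble_mean:+.3f}, driest member: {driest_member:+.3f}")

# Hypothetical ensemble-mean anomalies for the same month in other years.
# Ensemble means vary less than single members, so the ranking is done
# within this narrower distribution.
other_years_means = np.array([0.020, 0.015, 0.010, 0.005, 0.000, -0.005, -0.010, 0.012, 0.008])
percentile = 100.0 * np.count_nonzero(other_years_means <= ensemble_mean) / other_years_means.size
print(f"percentile of this year's ensemble mean: {percentile:.0f}")
```

The ensemble mean is much less extreme than the driest member, yet it still ranks as the driest value when compared with other ensemble means, which is the sense of the note above.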

Figure 24 provides a summary of the results. The ensemble mean forecast of April root zone 
moisture was averaged over the area enclosed by the red square, and the resulting large-scale 
forecasts (in terms of percentiles for the ensemble averages) are shown for different start dates 
and different lead times in the final three columns (red lines). Shown for comparison are the 
corresponding OL simulation results (black lines); while these cannot be interpreted as “truth”, 
because they are products of the same model, they are nevertheless presumably close enough to 
the truth to be useful. Of particular interest are the results in the bottom row, which corroborate 
some of the findings already discussed: the model is able to predict drier conditions for April in 
the indicated region even as early as February. The model also predicts well the wet conditions 
seen in mid-2010 (top row). 

Figure 22: Columns 1-3: Root zone soil moisture forecasts for different forecast initialization dates, expressed as 
percentiles. The April forecasts (lowest panel in each of these columns) can be compared to the OL simulation results 
(Column 4) and the NDVI data (Column 5). 

Figure 23: Root zone soil moisture forecasts for April (expressed as percentiles) as produced by the seven different 
forecast ensemble members, each initialized on March 1. The three panels in the lower right show, respectively, the 
ensemble mean forecast for April, the April percentiles produced in the OL simulation, and the observed NDVI anomaly 
for early May. 




[Figure 24 panel titles: Initialized on May 1, 2010; Initialized on June 1, 2010; Initialized on July 1, 2010; Initialized on February 1, 2011; Initialized on March 1, 2011; Initialized on April 1, 2011. X-axis: forecast month (e.g., Feb, Mar, Apr).] 

Figure 24: Forecasted root zone soil moisture percentiles for the indicated region (bounded by the red square) for 
different start dates (different panels) and lead times (x-axis within each panel). The forecasts are shown as red lines; an 
estimate of the true percentiles, as derived from the OL simulation, is shown as black lines. 


(b) Human Migration 


Simulated soil moisture at 0.25° x 0.25° resolution can be analyzed alongside estimates of human migration. 
The human migration dataset was derived by our project partners at CRREL from the 
Population Movement Trends (PMT) portal (UNHCR, 2013). It provides drought-induced 
population movements reported by the source district of the displacement, at monthly 
intervals from January 2008 to September 2012. The resolution and location of the gridded GEOS-5 
forecast products required a method of assigning grid cells to the geographic units of interest in a 
given statistical analysis. In the case of the PMT datasets, these units of interest are Somali 
administrative units at the district level. Because the spatial resolution of the model grid cells 
is similar to that of Somali administrative districts, grid cells were assigned to a specific 
district when at least half of the area of the grid cell fell within that district’s boundary. 
Subsequently, the seven districts that were not large enough to contain more than half of any 
one grid cell were manually assigned one or more grid cells that best represented the spatial 
extent of the district. Here we divide the IDP (internally displaced persons) count by the local 
population density and thus work with normalized migration (human displacement) data. 
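
The area-based assignment rule can be sketched as follows, assuming the fraction of each grid cell lying inside each district has already been computed (for example, with GIS software). The function name `assign_cells`, the `area_fraction` layout, and the numbers are hypothetical; the manual assignment of the remaining small districts is not automated here.

```python
import numpy as np

def assign_cells(area_fraction, threshold=0.5):
    """Assign each grid cell to the district covering at least `threshold` of its area.

    area_fraction[c, d] is the fraction of cell c lying inside district d.
    Returns one district index per cell, with -1 for cells left unassigned.
    """
    best = area_fraction.argmax(axis=1)
    covered = area_fraction.max(axis=1) >= threshold
    return np.where(covered, best, -1)

# Hypothetical overlap fractions: 4 grid cells (rows) by 4 districts (columns)
area_fraction = np.array([
    [0.9, 0.1, 0.0, 0.0],
    [0.3, 0.6, 0.0, 0.1],
    [0.4, 0.4, 0.1, 0.1],   # no district reaches half of this cell
    [0.0, 0.1, 0.8, 0.1],
])

assignment = assign_cells(area_fraction)
# Districts that received no cell would need the manual step described above
unassigned_districts = np.setdiff1d(np.arange(area_fraction.shape[1]), assignment)
print(assignment, unassigned_districts)
```

In this example the fourth district never covers half of any cell, so it is flagged for manual assignment.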

A caveat is necessary here. We do not claim to provide a predictive equation for human 
migration based on our simulations - the available data for migration (4 years) are simply too 
limited for adequate statistical analysis. The correlations shown below should be considered as 
qualitative rather than quantitative, and they are not necessarily indicative of a causal 
relationship. That said, the correlations do suggest that a predictive equation may exist and 
could be established, given the collection of more data. 

We computed correlations, for three different averaging periods (1 month, 2 months, and 3 
months), between normalized migration and four quantities: 

1) The root zone soil moisture during the month prior to the averaging period, as determined 
from the open loop simulation (i.e., using the prior month’s moisture as the predictor of 
migration). 

2) Same as (1), but as determined from the simulation using GRACE data assimilation. 

3) The root zone soil moisture for the averaging period produced during the open loop simulation 
(i.e., using a “best possible” estimate of moisture as the predictor, in the absence of assimilation.) 

4) Same as (3), but as produced in the simulation using GRACE data assimilation. 

In all cases, a mean seasonal cycle is subtracted from the soil moisture and migration data prior 
to the correlation calculations. This seasonal cycle is necessarily very crude, given the 4-yr 
period examined. Note that correlations with the 3rd and 4th quantities above are presented for 
reference only; these soil moisture quantities, which rely on known precipitation and/or GRACE 
measurements during the forecast period, are not true forecasts. 
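
The calculation for the first quantity (prior-month soil moisture as the predictor) can be sketched as below. This is an illustration on synthetic data, not the analysis code; the function names, the assumption that each series starts in January and spans whole years, and the choice to remove the seasonal cycle before averaging are all assumptions made for the example.

```python
import numpy as np

def remove_seasonal_cycle(series):
    """Subtract the mean of each calendar month from a monthly series.

    Assumes the series starts in January and spans whole years.
    """
    by_year = np.asarray(series, dtype=float).reshape(-1, 12)
    return (by_year - by_year.mean(axis=0)).ravel()

def prior_month_correlation(soil_moisture, migration, window=1):
    """Correlate soil moisture in month t-1 with migration averaged over months t..t+window-1."""
    sm = remove_seasonal_cycle(soil_moisture)
    mig = remove_seasonal_cycle(migration)
    # Mean migration anomaly over each averaging period
    mig_avg = np.convolve(mig, np.ones(window) / window, mode="valid")
    # Pair the period starting at month t with soil moisture at month t-1
    return np.corrcoef(sm[: mig_avg.size - 1], mig_avg[1:])[0, 1]

# Synthetic 4-year monthly series in which dry months are followed by more displacement
rng = np.random.default_rng(1)
months = np.arange(48)
noise = rng.normal(0.0, 0.03, size=48)
soil_moisture = 0.25 + 0.05 * np.sin(2 * np.pi * months / 12) + noise
migration = (10.0 + 3.0 * np.cos(2 * np.pi * months / 12)
             - 50.0 * np.roll(noise, 1) + rng.normal(0.0, 0.5, size=48))

r = prior_month_correlation(soil_moisture, migration, window=1)
print(f"correlation for the 1-month averaging period: {r:.2f}")
```

With only four values per calendar month, the estimated seasonal cycle absorbs part of the signal, which is one reason the text describes it as very crude.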

Figure 25 shows the results. Negative correlations (blue colors) indicate a migration of people 
away from an area during times of lower-than-average soil moisture (i.e., drought-induced 
migration). Several features of Figure 25 stand out: (i) the blue colors generally dominate the 
yellow colors, suggesting a general correlation in the expected direction; (ii) the correlations are 
nevertheless generally small, though some larger values appear in certain areas; (iii) the 
correlations do not improve on average when the “correct” precipitation is applied (3rd and 4th 
columns), though they do improve significantly in some locations; (iv) correlations are smallest 
for the 1-month averaging period; and (v) the correlations are not generally stronger 
when GRACE data are used rather than the open loop data. Again, all of these features are 
suggestive rather than conclusive; a proper statistical analysis would require substantially more 
data. 

Figure 25. First column: correlation between a soil moisture predictor (the soil moisture in the month prior to the 
averaging period, from the OL simulation) and the normalized human displacement during the averaging period. 
Negative correlations are consistent with the idea that drought induces migration. Second column: Same, but for the soil 
moisture predictor taken from the DA-TWS simulation. Third column: same, but for the soil moisture predictor set to 
the soil moisture produced by the OL simulation during the averaging period (i.e., not representing a forecast). Fourth 
column: same, but for the soil moisture predictor set to the soil moisture produced by the DA-TWS simulation during the 
averaging period (again, not a forecast). 

E. Summary 


This work set out to answer the question: “Can existing, linked infrastructures be used to predict 
the onset of drought months in advance?” Based on our work, the answer to this question is 
“yes”, with the qualifiers that skill depends on both lead time and location, and especially on 
the teleconnections (e.g., ENSO, Indian Ocean Dipole) active in a given 
region and season. 

As part of this work, we successfully developed a prototype drought early warning system based 
on existing/mature NASA Earth science components including the Goddard Earth Observing 
System Data Assimilation System Version 5 (GEOS-5) forecasting model, the Land Information 
System (LIS) land data assimilation software framework, the Catchment Land Surface Model 
(CLSM), remotely sensed terrestrial water storage from the Gravity Recovery and Climate 
Experiment (GRACE) and remotely sensed soil moisture products from the Aqua/Advanced 
Microwave Scanning Radiometer - EOS (AMSR-E). We focused on a single drought year, 
2011, during which major agricultural droughts occurred with devastating impacts in the 
Texas-Mexico region of North America and the Horn of Africa. 

Our results demonstrate that GEOS-5 precipitation forecasts show skill globally at 1-month lead, 
and can show up to 3 months of skill regionally in the TEXMEX and HOA areas. Our results also 
demonstrate that the CLSM soil moisture percentiles are a good indicator of drought, as 
compared to the North American Drought Monitor for TEXMEX and a combination of 
FEWS-NET data and MODIS NDVI anomalies over HOA. 

The data assimilation experiments produced mixed results. Neither soil moisture nor 
evaporation was significantly improved after assimilating AMSR-E soil moisture products. 
Hence, soil moisture assimilation was not found to significantly impact the ability of CLSM to 
monitor drought, as expressed via soil moisture percentiles. In contrast, the GRACE terrestrial 
water storage (TWS) assimilation was found to significantly improve soil moisture and 
evapotranspiration, as well as drought monitoring via soil moisture percentiles. 

We carried out forecast experiments at lead times of 1-3 months using archived and properly rescaled 
GEOS-5 forecasts as input to LIS/CLSM, with the three uncoupled simulations (OL, DA-SM, 
and DA-TWS) providing the initial conditions. Based on these forecast experiments, we find that the 
differences between the OL, DA-SM and DA-TWS initial conditions are not significant, but that 
the expected skill in GEOS-5 forecasts from 1-3 months is present in the soil moisture 
percentiles used to indicate drought. In the case of the HOA drought, the failure of the long rains 
in April appears in the February 1, March 1 and April 1 initialized forecasts, suggesting that for 
this case, drought forecasting would have provided some advance warning of the drought 
conditions observed in 2011. 

F. Recommendations 


Drought forecasting skill is unique to each model, region and season, and requires an 
understanding of the teleconnections that lead to predictable patterns of temperature and 
precipitation over a given area. This study was focused on developing a prototype drought 
forecasting system based on existing NASA science. While the initial results are promising, the 
system was evaluated for a single set of droughts observed in 2011 over two regions: the 
Texas-Mexico drought over North America and the Horn of Africa drought. Hence, our first 
recommendation is to carry out a comprehensive analysis of droughts observed over the entire 
period of record for GEOS-5 forecasts. 

A key finding of this work is that the ability of GEOS-5 to capture the Indian Ocean Dipole can 
lead to 1-3 month predictions of drought in the Horn of Africa. As noted in the discussion on 
GEOS-5, these teleconnections can lead to skill in both the long rains and short rains, with 
GEOS-5 suggesting a higher probability of predicting anomalies in the short rains. A second 
recommendation is to continue to analyze the GEOS-5 forecasts in HOA, stratifying by 
anomalies in long and short rains, to better quantify the skill for these key seasonal cycles of 
rainfall and crop production for the HOA region. 

Finally, the hypothesis that the GRACE Terrestrial Water Storage data assimilation and the 
Aqua/AMSR-E soil moisture data assimilation would improve drought prediction by providing 
better initial conditions was not supported by our results. We did show that GRACE TWS 
assimilation improves soil moisture and potentially drought monitoring. However, a single year 
of analysis is inadequate to fully demonstrate whether this information provides a benefit or not. 
Newer sensors such as the ESA Soil Moisture/Ocean Salinity (SMOS) mission and the upcoming NASA 
Soil Moisture Active/Passive (SMAP) mission, to be launched in Fall 2014, should provide deeper 
soil moisture information that is expected to be valuable. Therefore, a final recommendation is to 
continue to include GRACE TWS and SMOS/SMAP soil moisture products in a routine activity 
building on this prototype to further quantify the benefits for drought assessment and prediction. 